71 research outputs found

    Less than meets the eye: the diagnostic information for visual categorization

    Current theories of visual categorization are cast in terms of information-processing mechanisms that use mental representations. However, the actual information contents of these representations are rarely characterized, which in turn hinders knowledge of the mechanisms that use them. In this thesis, I identified these contents by extracting the information that supports behavior under given tasks, i.e. the task-specific diagnostic information. In the first study (Chapter 2), I modelled the diagnostic face information for familiar face identification, using a unique generative model of face identity information combined with perceptual judgments and reverse correlation. I then demonstrated the validity of this information using everyday perceptual tasks that generalize face identity and resemblance judgments to new viewpoints, age, and sex with a new group of participants. My results showed that human participants represent only a proportion of the objective identity information available, but what they do represent is both sufficiently detailed and versatile to generalize face identification successfully across diverse tasks. In the second study (Chapter 3), I modelled the diagnostic facial movements for recognizing facial expressions of emotion. I used models that characterize the mental representations of six facial expressions of emotion (Happy, Surprise, Fear, Anger, Disgust, and Sad) in individual observers and validated them on a new group of participants. With the validated models, I derived the main signal variants for each emotion and their probabilities of occurrence within each emotion. Using these variants and their probabilities, I trained a Bayesian classifier and showed that it closely mimics human observers' categorization performance. My results demonstrated that such emotion variants and their probabilities of occurrence comprise observers' mental representations of facial expressions of emotion. In the third study (Chapter 4), I investigated how the brain reduces high-dimensional visual input into low-dimensional diagnostic representations to support scene categorization. To do so, I used an information-theoretic framework called Contentful Brain and Behavior Imaging (CBBI) to tease apart stimulus information that supports behavior (i.e., diagnostic) from that which does not (i.e., nondiagnostic). I then tracked the dynamic representations of both in magneto-encephalographic (MEG) activity. Using CBBI, I demonstrated that a rapid (~170 ms) reduction of nondiagnostic information occurs in the occipital cortex, and that diagnostic information progresses into the right fusiform gyrus, where representations are constructed to support distinct behaviors. My results highlight how CBBI can be used to investigate information processing in brain activity by considering interactions between three variables (stimulus information, brain activity, behavior), rather than just two, as is the current norm in neuroimaging studies. I discussed task-specific diagnostic information as individuals' dynamic, experience-based representations of the physical world, which provide the much-needed information to probe and understand the black box of high-dimensional, deep, and biological brain networks. I also discussed practical concerns about using the data-driven approach to uncover diagnostic information.
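
    The Bayesian classifier described in Chapter 3 can be illustrated with a minimal sketch: each emotion is represented by a handful of signal variants (here, binary action-unit patterns) with occurrence probabilities, and an observed pattern is categorized by its posterior under a simple noise model. The names, sizes, and Bernoulli noise model below are illustrative assumptions, not the thesis's actual implementation.

        import numpy as np

        rng = np.random.default_rng(0)
        emotions = ["happy", "surprise", "fear", "anger", "disgust", "sad"]
        n_aus = 42                      # number of action units (assumed)

        # Hypothetical stored representations: 3 variants per emotion,
        # each a binary action-unit pattern, with occurrence probabilities.
        variants = {e: rng.integers(0, 2, size=(3, n_aus)) for e in emotions}
        variant_probs = {e: np.full(3, 1 / 3) for e in emotions}

        def classify(pattern, noise=0.1):
            # Likelihood assumes each AU flips independently with
            # probability `noise` (a Bernoulli mixture over variants).
            posteriors = {}
            for e in emotions:
                mismatches = (variants[e] != pattern).sum(axis=1)
                lik = noise**mismatches * (1 - noise)**(n_aus - mismatches)
                posteriors[e] = variant_probs[e] @ lik   # mix over variants
            total = sum(posteriors.values())
            return {e: p / total for e, p in posteriors.items()}

        post = classify(variants["happy"][0])
        print(max(post, key=post.get))                   # -> "happy"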

    Reverse Engineering Psychologically Valid Facial Expressions of Emotion into Social Robots

    Social robots are now part of human society, destined for schools, hospitals, and homes to perform a variety of tasks. To engage their human users, social robots must be equipped with the essential social skill of facial expression communication. Yet even state-of-the-art social robots are limited in this ability because they often rely on a restricted set of facial expressions derived from theory, with well-known limitations such as lacking naturalistic dynamics. With no agreed methodology to objectively engineer a broader variance of more psychologically impactful facial expressions into social robots' repertoires, human-robot interactions remain restricted. Here, we address this challenge with new methodologies that can reverse-engineer dynamic facial expressions into a social robot head, as sketched below. Our data-driven, user-centered approach, which combines human perception with psychophysical methods, produced highly recognizable and human-like dynamic facial expressions of the six classic emotions that generally outperformed state-of-the-art social robot facial expressions. Our data demonstrate the feasibility of our method applied to social robotics and highlight the benefits of a data-driven approach that puts human users at the center of deriving facial expressions for social robots. We also discuss future work to reverse-engineer a wider range of socially relevant facial expressions, including conversational messages (e.g., interest, confusion) and personality traits (e.g., trustworthiness, attractiveness). Together, our results highlight the key role that psychology must continue to play in the design of social robots.
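
    As a concrete illustration of the data-driven, reverse-engineering logic, the following minimal sketch samples random action-unit (AU) activation patterns, collects a (here simulated) human categorization for each, and estimates the diagnostic AUs by reverse correlation. The sizes, AU indices, and simulated observer are illustrative assumptions, not the authors' actual pipeline.

        import numpy as np

        rng = np.random.default_rng(1)
        n_trials, n_aus = 2000, 42       # trial and AU counts assumed

        # Random AU amplitudes shown on each trial (dynamics omitted here).
        stimuli = rng.random((n_trials, n_aus))
        # Stand-in for human responses: pretend AUs 6 and 12 signal "happy".
        responses = (stimuli[:, 6] + stimuli[:, 12] > 1.1).astype(int)

        # Classification image: mean stimulus on "happy" trials minus the
        # rest; large positive weights mark the diagnostic AUs.
        ci = stimuli[responses == 1].mean(0) - stimuli[responses == 0].mean(0)
        print("Most diagnostic AUs for 'happy':", np.argsort(ci)[::-1][:3])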

    Modelling Face Memory Reveals Task-generalizable Representations

    Current cognitive theories are cast in terms of information-processing mechanisms that use mental representations. For example, people use their mental representations to identify familiar faces under various conditions of pose, illumination and ageing, or to draw resemblance between family members. Yet the actual information contents of these representations are rarely characterized, which hinders knowledge of the mechanisms that use them. Here, we modelled the three-dimensional representational contents of four faces that were familiar to 14 participants as work colleagues. The representational contents were created by reverse-correlating identity information generated on each trial with judgements of the face's similarity to the individual participant's memory of this face. In a second study, testing new participants, we demonstrated the validity of the modelled contents using everyday face tasks that generalize identity judgements to new viewpoints, age and sex. Our work highlights that such models of mental representations are critical to understanding generalization behaviour and its underlying information-processing mechanisms.
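
    The reverse-correlation step can be sketched as follows: each trial perturbs a face along identity components, the participant rates its similarity to the remembered face, and the rating-weighted average of the perturbations estimates the memory's direction in face space. The dimensions, trial counts, and simulated rater below are assumptions for illustration, not the authors' generative face model.

        import numpy as np

        rng = np.random.default_rng(2)
        n_trials, n_components = 1800, 50   # sizes assumed for illustration

        # Random identity perturbations presented across trials.
        perturbations = rng.standard_normal((n_trials, n_components))
        # Stand-in participant: memory lies along a fixed target direction.
        target = rng.standard_normal(n_components)
        ratings = perturbations @ target + rng.standard_normal(n_trials)

        # Zero-mean the ratings so the estimate is a weighted contrast.
        w = ratings - ratings.mean()
        estimate = (w[:, None] * perturbations).sum(0) / n_trials

        r = np.corrcoef(estimate, target)[0, 1]
        print(f"Recovered vs. true identity direction: r = {r:.2f}")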

    Dynamic Construction of Reduced Representations in the Brain for Perceptual Decision Behavior

    Summary: Over the past decade, extensive studies of the brain regions that support face, object, and scene recognition suggest that these regions have a hierarchically organized architecture that spans the occipital and temporal lobes [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], where visual categorizations unfold over the first 250 ms of processing [15, 16, 17, 18, 19]. This same architecture is flexibly involved in multiple tasks that require task-specific representations, e.g. categorizing the same object as "a car" or "a Porsche." While we partly understand where and when these categorizations happen in the occipito-ventral pathway, the next challenge is to unravel how they happen. That is, how does high-dimensional input collapse in the occipito-ventral pathway into the low-dimensional representations that guide behavior? To address this, we investigated what information the brain processes in a visual perception task and visualized the dynamic representation of this information in brain activity. To do so, we developed stimulus information representation (SIR), an information-theoretic framework, to tease apart stimulus information that supports behavior from that which does not. We then tracked the dynamic representations of both in magneto-encephalographic (MEG) activity. Using SIR, we demonstrate that a rapid (~170 ms) reduction of behaviorally irrelevant information occurs in the occipital cortex and that representations of the information that supports distinct behaviors are constructed in the right fusiform gyrus (rFG). Our results thus highlight how SIR can be used to investigate the component processes of the brain by considering interactions between three variables (stimulus information, brain activity, behavior), rather than just two, as is the current norm.
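
    A highly simplified stand-in for one ingredient of this analysis is shown below: computing, at each timepoint, the mutual information between a stimulus feature and median-split MEG amplitude, so the timecourse of feature representation can be tracked. The toy data, effect size, and onset timepoint are assumptions; SIR itself relates all three variables (stimulus, brain, behavior), which this two-variable sketch does not capture.

        import numpy as np

        def mi_discrete(x, y):
            # Mutual information (bits) between two discrete label arrays.
            joint = np.histogram2d(x, y, bins=(np.unique(x).size,
                                               np.unique(y).size))[0]
            pxy = joint / joint.sum()
            px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
            nz = pxy > 0
            return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

        rng = np.random.default_rng(3)
        n_trials, n_times = 400, 100
        feature = rng.integers(0, 2, n_trials)          # stimulus feature
        meg = rng.standard_normal((n_trials, n_times))  # toy sensor data
        meg[:, 40:] += feature[:, None] * 1.5           # effect from t = 40 on

        mi_curve = [mi_discrete(feature,
                                (meg[:, t] > np.median(meg[:, t])).astype(int))
                    for t in range(n_times)]
        print("Peak feature information at timepoint:", int(np.argmax(mi_curve)))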

    Revealing the information contents of memory within the stimulus information representation framework

    The information contents of memory are the cornerstone of the most influential models in cognition. To illustrate, consider that in predictive coding, a prediction implies that specific information is propagated down from memory through the visual hierarchy. Likewise, recognizing the input implies that sequentially accrued sensory evidence is successfully matched with memorized information (categorical knowledge). Although the existing models of prediction, memory, sensory representation and categorical decision are all implicitly cast within an information-processing framework, it remains a challenge to precisely specify what this information is, and therefore where, when and how the architecture of the brain dynamically processes it to produce behaviour. Here, we review a framework that addresses these challenges for the study of perception and categorization: stimulus information representation (SIR). We illustrate how SIR can reverse-engineer the information contents of memory from behavioural and brain measures in the context of specific cognitive tasks that involve memory. We discuss two specific lessons from this approach that apply generally to memory studies: the importance of task, to constrain what the brain does, and of stimulus variations, to identify the specific information contents that are memorized, predicted, recalled and replayed.

    CASM-AMFMNet: A Network based on Coordinate Attention Shuffle Mechanism and Asymmetric Multi-Scale Fusion Module for Classification of Grape Leaf Diseases

    Grape disease is a significant contributory factor to the decline in grape yield, typically affecting the leaves first, and efficient identification of grape leaf diseases remains a critical unmet need. To mitigate background interference in grape leaf feature extraction and improve the extraction of small disease spots, we developed a novel method for disease recognition and classification that draws on the characteristic features of grape leaf diseases. First, a Gaussian filter Sobel smooth de-noising Laplace operator (GSSL) was employed to reduce image noise and enhance the texture of grape leaves. A novel network, designated coordinate attention shuffle mechanism-asymmetric multi-scale fusion module net (CASM-AMFMNet), was subsequently applied for grape leaf disease identification. CoAtNet was employed as the network backbone to improve model learning and generalization capabilities, which alleviated the problem of gradient explosion to a certain extent. The coordinate attention shuffle mechanism was further utilized to capture and target grape leaf disease areas, thereby reducing background interference. Finally, the asymmetric multi-scale fusion module (AMFM) was employed to extract multi-scale features from small disease spots on grape leaves for accurate identification of small-target diseases. Experimental results on our self-built grape leaf image dataset showed that, compared to existing methods, CASM-AMFMNet achieved an accuracy of 95.95%, an F1 score of 95.78%, and a mAP of 90.27%. Overall, the model and methods proposed here successfully identify different diseases of grape leaves, provide a feasible scheme for the correct recognition of grape diseases in agricultural production, and may serve as a reference for other crop diseases.
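
    The GSSL preprocessing step can be approximated with standard OpenCV primitives, as in the sketch below: Gaussian de-noising, Sobel gradient magnitude for edges, and a Laplacian term for fine texture. The kernel sizes and blending weights are assumptions for illustration, not the paper's published settings.

        import cv2
        import numpy as np

        def gssl_enhance(path):
            img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
            blurred = cv2.GaussianBlur(img, (5, 5), 0)          # de-noise
            gx = cv2.Sobel(blurred, cv2.CV_32F, 1, 0, ksize=3)  # x gradient
            gy = cv2.Sobel(blurred, cv2.CV_32F, 0, 1, ksize=3)  # y gradient
            edges = cv2.magnitude(gx, gy)                       # edge strength
            lap = cv2.Laplacian(blurred, cv2.CV_32F, ksize=3)   # fine texture
            out = blurred.astype(np.float32) + 0.5 * edges - 0.5 * lap
            return cv2.normalize(out, None, 0, 255,
                                 cv2.NORM_MINMAX).astype(np.uint8)

        # Example (hypothetical filename):
        # cv2.imwrite("enhanced.png", gssl_enhance("grape_leaf.jpg"))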
